AITopics | weight regularization

weight regularization

2 Method. Notations: We use $X^\top$, $X^{-1}$, $\mathrm{Tr}(X)$ and $\mathrm{vec}(X)$ to denote the transpose, inverse, trace, and column-wise vectorization of a matrix $X$. We use $X \otimes Y$ to represent the Kronecker product
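The notation in this snippet can be tried out numerically. The following is a minimal sketch using NumPy, not code from the listed paper: the helper `vec` and the matrices `A`, `X`, `B` are hypothetical names chosen for illustration. It checks two standard identities that connect column-wise vectorization, the trace, and the Kronecker product.

```python
import numpy as np

def vec(X):
    """Column-wise vectorization: stack the columns of X into one vector."""
    return X.reshape(-1, order="F")

# Illustrative random matrices (hypothetical shapes, fixed seed).
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
X = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# Standard identity: vec(A X B) = (B^T kron A) vec(X).
lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
kron_identity_holds = bool(np.allclose(lhs, rhs))

# Standard identity: Tr(A^T A) = vec(A) . vec(A).
trace_identity_holds = bool(np.isclose(np.trace(A.T @ A), vec(A) @ vec(A)))
```

The `order="F"` argument is what makes the vectorization column-wise; NumPy's default row-major flattening would not satisfy the Kronecker identity in this form.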

Neural Information Processing Systems, Feb-11-2026, 18:26:33 GMT

In contrast, artificial agents are prone to 'catastrophic forgetting' whereby performance on previous tasks deteriorates rapidly as new ones are acquired. This shortcoming has recently been addressed using methods that encourage parameters to stay close to those used for previous tasks.

artificial intelligence, continual learning, machine learning, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

ca4b5656b7e193e6bb9064c672ac8dce-Supplemental.pdf

Neural Information Processing Systems, Feb-10-2026, 09:01:46 GMT

architecture, child architecture, controller, (14 more...)

Country:

Asia > Taiwan (0.06)
North America > United States (0.05)
North America > Canada (0.05)

Genre: Instructional Material > Online (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)

NICE: NoIse-modulated Consistency rEgularization for Data-Efficient GANs

Yao Ni

Neural Information Processing Systems, Feb-9-2026, 12:37:49 GMT

Generative Adversarial Networks (GANs) are powerful tools for image synthesis.

artificial intelligence, discriminator, machine learning, (15 more...)

Country: Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

05cdc7feee41e3572a9a3f4acb773891-Paper-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 09:54:28 GMT

Conversely, the opposite scenario is relatively benign.

artificial intelligence, machine learning, similarity, (16 more...)

Country:

South America > Peru > Lima Department > Lima Province > Lima (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)

Genre: Research Report (0.46)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Natural continual learning: success is a journey, not (just) a destination

Neural Information Processing Systems, Dec-25-2025, 04:21:21 GMT

Biological agents are known to learn many different tasks over the course of their lives, and to be able to revisit previous tasks and behaviors with little to no loss in performance. In contrast, artificial agents are prone to 'catastrophic forgetting' whereby performance on previous tasks deteriorates rapidly as new ones are acquired. This shortcoming has recently been addressed using methods that encourage parameters to stay close to those used for previous tasks. This can be done by (i) using specific parameter regularizers that map out suitable destinations in parameter space, or (ii) guiding the optimization journey by projecting gradients into subspaces that do not interfere with previous tasks.
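The two strategies named in this abstract can be illustrated with a small NumPy sketch. This is not the paper's algorithm: the function names, the per-parameter `importance` weights, and the orthonormal `protected_basis` are assumptions made for illustration. The first function is a generic quadratic penalty that pulls parameters toward a previous solution (strategy i); the second removes the gradient component that lies in a subspace assumed to matter for earlier tasks (strategy ii).

```python
import numpy as np

def quadratic_penalty(theta, theta_prev, importance, strength=1.0):
    """Generic penalty that grows as parameters drift from a previous solution.

    `importance` holds hypothetical per-parameter weights; larger values make
    drift in that coordinate more expensive.
    """
    diff = theta - theta_prev
    return 0.5 * strength * float(np.sum(importance * diff ** 2))

def project_gradient(grad, protected_basis):
    """Remove the part of `grad` lying in the span of `protected_basis`.

    `protected_basis` is assumed to have orthonormal columns spanning
    directions that should not change because earlier tasks rely on them.
    """
    return grad - protected_basis @ (protected_basis.T @ grad)

# Illustrative values only.
theta_prev = np.array([1.0, -2.0, 0.5])
theta = np.array([1.5, -2.0, 0.0])
importance = np.array([4.0, 1.0, 0.1])
penalty = quadratic_penalty(theta, theta_prev, importance)

basis = np.array([[1.0], [0.0], [0.0]])  # protect the first coordinate
grad = np.array([0.3, -0.2, 0.7])
projected = project_gradient(grad, basis)
```

In this toy setup the penalty is dominated by the first coordinate because its importance weight is largest, and the projected gradient has no component along the protected direction, so a gradient step leaves that coordinate unchanged.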

journey, name change, natural continual learning, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.36)

05cdc7feee41e3572a9a3f4acb773891-Paper-Conference.pdf

Neural Information Processing Systems, Oct-9-2025, 17:36:28 GMT

regularization, retention performance, similarity, (14 more...)

Country:

South America > Peru > Lima Department > Lima Province > Lima (0.04)
North America > United States > Missouri > St. Louis County > St. Louis (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

On the Stochastic Stability of Deep Markov Models

Neural Information Processing Systems, Oct-9-2025, 16:19:14 GMT

This section proposes additional regularization methods for learning stable deep Markov models. The most direct approach is to include the stability conditions as extra penalties in the DMM loss function.

artificial intelligence, deep markov model, machine learning, (16 more...)

Country:

Oceania > Australia > New South Wales > Sydney (0.05)
North America > United States > Washington > Benton County > Richland (0.05)
North America > United States > Tennessee > Anderson County > Oak Ridge (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.64)

2c8047bf3ed8ef6905351608d641f02f-Paper-Conference.pdf

Neural Information Processing Systems, Oct-8-2025, 09:04:59 GMT

discriminator, noise, regularization, (13 more...)

Country: Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)

Model Guidance via Robust Feature Attribution

Ghitu, Mihnea, Piratla, Vihari, Wicker, Matthew

arXiv.org Artificial Intelligence, Sep-23-2025

Controlling the patterns a model learns is essential to preventing reliance on irrelevant or misleading features. Such reliance on irrelevant features, often called shortcut features, has been observed across domains, including medical imaging and natural language processing, where it may lead to real-world harms. A common mitigation strategy leverages annotations (provided by humans or machines) indicating which features are relevant or irrelevant. These annotations are compared to model explanations, typically in the form of feature salience, and used to guide the loss function during training. Unfortunately, recent works have demonstrated that feature salience methods are unreliable and therefore offer a poor signal to optimize. In this work, we propose a simplified objective that simultaneously optimizes for explanation robustness and mitigation of shortcut learning. Unlike prior objectives with similar aims, we demonstrate theoretically why our approach ought to be more effective. Across a comprehensive series of experiments, we show that our approach consistently reduces test-time misclassifications by 20% compared to state-of-the-art methods. We also extend prior experimental settings to include natural language processing tasks. Additionally, we conduct novel ablations that yield practical insights, including the relative importance of annotation quality over quantity. Code for our method and experiments is available at: https://github.com/Mihneaghitu/ModelGuidanceViaRobustFeatureAttribution.
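To make the "common mitigation strategy" in this abstract concrete, here is a minimal NumPy sketch of a generic salience-based guidance penalty for a logistic model. It illustrates the earlier family of approaches the abstract refers to, not the simplified robust objective the authors propose. The names `input_gradient`, `guidance_penalty`, and `irrelevant_mask` are hypothetical, and input-gradient salience is an assumed choice of explanation method.

```python
import numpy as np

def input_gradient(w, x):
    """Gradient of sigmoid(w . x) with respect to the input x."""
    s = 1.0 / (1.0 + np.exp(-(w @ x)))
    return s * (1.0 - s) * w

def guidance_penalty(w, x, irrelevant_mask):
    """Squared salience mass on features annotated as irrelevant.

    `irrelevant_mask` is 1.0 where an annotation marks a feature as
    irrelevant (a potential shortcut) and 0.0 elsewhere.
    """
    g = input_gradient(w, x)
    return float(np.sum((irrelevant_mask * g) ** 2))

# Illustrative values: the third feature is annotated as irrelevant.
w = np.array([2.0, 0.0, -1.0])
x = np.array([0.5, 1.0, 1.0])
mask = np.array([0.0, 0.0, 1.0])

penalty = guidance_penalty(w, x, mask)

# A model that ignores the annotated feature incurs no penalty.
w_clean = np.array([2.0, 0.0, 0.0])
penalty_clean = guidance_penalty(w_clean, x, mask)
```

In training, a term like this would be added to the task loss with a weighting coefficient, so that the optimizer is discouraged from relying on the masked features.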

artificial intelligence, machine learning, natural language, (19 more...)

2506.1968

Country: Europe (0.67)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Health & Medicine > Diagnostic Medicine > Imaging (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Yet Unnoticed in LSTM: Binary Tree Based Input Reordering, Weight Regularization, and Gate Nonlinearization

Moattari, Mojtaba

arXiv.org Artificial Intelligence, Sep-3-2025

LSTM models, used in current machine learning literature and applications, offer a promising solution for retaining long-term information through gating mechanisms that forget and reduce the effect of current input information. However, even with this pipeline, they do not optimally focus on a specific old index or on long-term information. This paper elaborates on input reordering approaches to prioritize certain input indices. Moreover, no LSTM-based approach in the literature examines weight normalization while choosing the right weight and exponent of Lp norms through the main supervised loss function. In this paper, we determine which norm best captures the relationship between weights in order to either smooth or sparsify them. Lastly, gates, as weighted representations of inputs and states that control the extent to which the current input is reduced relative to previous inputs (that is, the state), are not sufficiently nonlinearized (for instance, through a small FFNN). Analogous to attention mechanisms, gates readily filter current information to emphasize past inputs. Nonlinearized gates can more easily adapt to the particular nonlinearities of specific past inputs. To the best of the author's knowledge, this type of nonlinearization has not been proposed in the literature. The proposed approaches are implemented and compared with a simple LSTM to assess their performance in text classification tasks. The results show that they improve the accuracy of the LSTM.
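The Lp-norm weight regularization mentioned in this abstract can be sketched as follows. This is a generic illustration and not the author's implementation: `lp_penalty`, `lam`, and `p` are hypothetical names, and the values are fixed by hand here, whereas the abstract describes choosing the weight and exponent through the supervised loss.

```python
import numpy as np

def lp_penalty(weights, p, lam):
    """Return lam * sum(|w|^p), a generic Lp-style weight penalty.

    p = 1 gives an L1-style term that tends to sparsify weights;
    p = 2 gives an L2-style term that tends to shrink them smoothly.
    """
    return lam * float(np.sum(np.abs(weights) ** p))

def regularized_loss(task_loss, weights, p, lam):
    """Add the penalty to a task loss supplied by the caller."""
    return task_loss + lp_penalty(weights, p, lam)

# Illustrative weight vector and coefficients.
w = np.array([0.5, -1.0, 0.0, 2.0])
l1_term = lp_penalty(w, p=1.0, lam=0.1)
l2_term = lp_penalty(w, p=2.0, lam=0.1)
total = regularized_loss(task_loss=1.0, weights=w, p=2.0, lam=0.1)
```

With the same coefficient, the p = 2 term penalizes the large weight (2.0) much more heavily than the p = 1 term does, which is one way to see why the choice of exponent changes whether weights are smoothed or sparsified.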

artificial intelligence, deep learning, machine learning, (16 more...)

2509.00087

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
